Let $S_n$ be the permutation group on $n$ elements, and consider a random walk on $S_n$ whose step distribution is uniform on $k$-cycles. We prove a well-known conjecture that the mixing time of this process is $(1/k)\,n\log n$, with threshold of width linear in $n$. Our proofs are elementary and purely probabilistic, and do not appeal to the representation theory of $S_n$.

1. Introduction. 

1.1. Main result. Let $S_n$ be the group of permutations of $\{1, \ldots, n\}$. Any permutation $\sigma \in S_n$ has a unique cycle decomposition, which partitions the set $\{1, \ldots, n\}$ into orbits under the natural action of $\sigma$. The cycle structure of $\sigma$ is the integer partition of $n$ associated with this set partition, in other words, the ordered sizes of the cycles (blocks of the partition) ranked in decreasing size. It is customary not to include the fixed points of $\sigma$ in this structure. For instance, the permutation

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 2 & 6 & 7 & 3 & 5 & 1 \end{pmatrix}$

has 3 cycles, (1 4 7)(2)(3 6 5), so its cycle structure is (3,3) (and one fixed point which does not appear in this structure). A conjugacy class $\Gamma \subset S_n$ is the set of permutations having a given cycle structure. Let $|\Gamma|$



Received November 2010. 

Supported in part by EPSRC Grant EP/GO55068/1.

Supported in part by NSF Grant DMS-08-04133 and by a grant from the Israel Science Foundation.

AMS 2000 subject classifications. 60B15, 60J27. 

Key words and phrases. Mixing times, coalescence, cutoff phenomena, random cycles, 
random transpositions. 

This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Probability, 
2011, Vol. 39, No. 5, 1815-1843. This reprint differs from the original in 
pagination and typographic detail. 






N. BERESTYCKI, O. SCHRAMM AND O. ZEITOUNI 



denote the support of $\Gamma$, that is, the number of nonfixed points of any permutation $\sigma \in \Gamma$. In what follows we deal with the case where $\Gamma$ consists of a single $k$-cycle, in which case $|\Gamma| = k$ (see, however, Remark 2). It is well known and easy to see that in this case, if $k$ is even, then $\Gamma$ generates $S_n$, while if $k > 2$ is odd, then $\Gamma$ generates the alternating group $A_n$ of even permutations. Let $(\pi_t, t \ge 0)$ be the continuous-time random walk associated with $(S_n, \Gamma)$. That is, let $\gamma_1, \gamma_2, \ldots$ be a sequence of i.i.d. elements uniformly distributed on $\Gamma$, and let $(N_t, t \ge 0)$ be an independent Poisson process with rate 1; then we take

(1) $\pi_t = \gamma_1 \circ \cdots \circ \gamma_{N_t},$

where $\gamma \circ \gamma'$ indicates the composition of the permutations $\gamma$ and $\gamma'$. $(\pi_t, t \ge 0)$ is a Markov chain on $S_n$ which converges to the uniform distribution $\mu$ on $S_n$ when $|\Gamma|$ is even, and to the uniform distribution on $A_n$ when $|\Gamma| > 2$ is odd. In any case we shall write $\mu$ for that limiting distribution. We shall be interested in the mixing properties of this process as $n \to \infty$, as measured in terms of the total variation distance. Let $p_t(\cdot)$ be the distribution of $\pi_t$ on $S_n$, and let $\mu$ be the invariant distribution of the chain. Let

$d(t) = \|p_t(\cdot) - \mu\| = \frac{1}{2} \sum_{\sigma \in S_n} |p_t(\sigma) - \mu(\sigma)|,$

where $d(t)$ is the total variation distance between the state of the chain at time $t$ and its limiting distribution $\mu$. (Below, we will also use the notation $\|X - Y\|$ where $X$ and $Y$ are collections of random variables with laws $p_X, p_Y$ to mean $\|p_X - p_Y\|$.)

The main goal of this paper is to prove that the chain exhibits a sharp cutoff, in the sense that $d(t)$ drops abruptly from its maximal value 1 to its minimal value 0 around a certain time $t_{\mathrm{mix}}$, called the mixing time of the chain. (See [6] or [11] for a general introduction to mixing times.) Note that if $\Gamma$ is a fixed conjugacy class of $S_n$ and $m > n$, $\Gamma$ can also be considered a conjugacy class of $S_m$ by simply adding $m - n$ fixed points to any permutation $\sigma \in \Gamma$. With this in mind, our theorem states the following:

Theorem 1. Let $k \ge 2$ be an integer, and let $\Gamma_k$ be the conjugacy class of $S_n$ corresponding to $k$-cycles. The continuous time random walk $(\pi_t, t \ge 0)$ associated with $(S_n, \Gamma_k)$ has a cutoff at time $t_{\mathrm{mix}} := (1/k)\,n\log n$, in the sense that for any $\varepsilon > 0$, there exist $N_{\varepsilon,k}, C_{\varepsilon,k} > 0$ large enough so that for all $n \ge N_{\varepsilon,k}$,

(2) $d(t_{\mathrm{mix}} - C_{\varepsilon,k}\,n) > 1 - \varepsilon,$
(3) $d(t_{\mathrm{mix}} + C_{\varepsilon,k}\,n) < \varepsilon.$



As explained in Section 1.2 below, this result solves a well-known conjec- 
ture formulated by several people over the course of the years. 
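
To make the objects in (1) concrete, here is a minimal Python sketch, offered only as an illustration. It computes the cycle structure of the example permutation of Section 1.1 and simulates the walk driven by uniform $k$-cycles with a rate-1 Poisson clock. The function names, the 0-based relabelling of $\{1, \ldots, n\}$, the choice to multiply each new $k$-cycle on the right, and the parameters $n = 200$, $k = 3$ are assumptions made for this example; they are not taken from the paper. The fixed-point count is printed because the lower bound discussed in Section 1.2 is based on that statistic.

```python
import math
import random

def cycle_lengths(perm):
    """Cycle lengths of perm in decreasing order, fixed points included.
    perm is a list with perm[x] = image of x (points are labelled 0..n-1)."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)

def cycle_structure(perm):
    """Cycle structure in the sense of the text: fixed points are dropped."""
    return tuple(length for length in cycle_lengths(perm) if length > 1)

def apply_random_k_cycle(perm, k, rng):
    """Replace perm by perm o gamma, where gamma = (x_1 ... x_k) is a uniform k-cycle.
    A uniform ordered k-tuple of distinct points gives a uniform k-cycle,
    because every k-cycle corresponds to exactly k such tuples."""
    xs = rng.sample(range(len(perm)), k)
    images = [perm[xs[(i + 1) % k]] for i in range(k)]
    for x, image in zip(xs, images):
        perm[x] = image

def random_walk(n, k, t, rng):
    """Sample pi_t started from the identity: steps arrive as a rate-1 Poisson process."""
    perm = list(range(n))
    clock = rng.expovariate(1.0)
    while clock <= t:
        apply_random_k_cycle(perm, k, rng)
        clock += rng.expovariate(1.0)
    return perm

if __name__ == "__main__":
    # The example permutation 1->4, 2->2, 3->6, 4->7, 5->3, 6->5, 7->1, shifted to 0-based labels.
    sigma = [3, 1, 5, 6, 2, 4, 0]
    print(cycle_structure(sigma))  # (3, 3)

    rng = random.Random(2011)
    n, k = 200, 3
    t_mix = n * math.log(n) / k
    for c in (-1.0, 0.0, 1.0):
        perm = random_walk(n, k, t_mix + c * n, rng)
        fixed_points = sum(1 for x in range(n) if perm[x] == x)
        print(f"t = t_mix + ({c:+.0f}) n: {fixed_points} fixed points")
```

A single run only gives one sample of $\pi_t$; estimating $d(t)$ itself would require many independent runs and is not attempted here.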
Remark 2. Theorem 1 can be extended, without a significant change in the proofs, to cover the case of general fixed conjugacy classes $\Gamma$, with $k = |\Gamma| > 2$ independent of $n$. In order to alleviate notation, we present here only the proof for $k$-cycles. A more delicate question, that we do not investigate, is what growth of $k = k(n)$ is allowed so that Theorem 1 would still be true in the form

(4) $d(t_{\mathrm{mix}}(1 - \delta)) > 1 - \varepsilon,$
(5) $d(t_{\mathrm{mix}}(1 + \delta)) < \varepsilon$?

The lower bound in (4) is easy. For the upper bound in (5), due to the birthday problem, the case $k = o(\sqrt{n})$ should be fairly similar to the arguments we develop below, with adaptations in several places, for example, in the argument following (32); we have not checked the details. Things are likely to become more delicate when $k$ is of order $\sqrt{n}$ or larger. Yet, we conjecture that (5) holds as long as $k = o(n)$.

1.2. Background. This problem has a rather long history, which we now sketch. Mixing times of Markov chains were studied independently by Aldous [1] and by Diaconis and Shahshahani [7] at around the same time, in the early 1980s. Diaconis and Shahshahani [7], in particular, establish the existence of what has become known as the cutoff phenomenon for the composition of random transpositions. Random transpositions is perhaps the simplest example of a random walk on $S_n$ and is a particular case of the walks covered in this paper, arising when the conjugacy class $\Gamma$ contains exactly all transpositions. The authors of [7] obtained a version of Theorem 1 for this particular case (with explicit choices of $C_{\varepsilon,2}$ for a given $\varepsilon$). As is the case here, the hard part of the result is the upper bound (3). Remarkably, their solution involved a connection with the representation theory of $S_n$, and uses rather delicate estimates on so-called character ratios.

Soon afterwards, a flurry of papers tried to generalize the results of [7] in the direction we are taking in this paper, that is, when the step distribution is uniform over a fixed conjugacy class $\Gamma$. However, the estimates on character ratios that are needed become harder and harder as $|\Gamma|$ increases. Flatto, Odlyzko and Wales [9], building on earlier work of Vershik and Kerov [21], obtained finer estimates on character ratios and were able to show that mixing must occur before $(1/2)\,n\log n$ for $|\Gamma|$ fixed, thus giving another proof of the Diaconis-Shahshahani result when $|\Gamma| = 2$. (Although this does not appear explicitly in [9], it is recounted in Diaconis's book [6], page 44.) Improving further the estimates on character ratios, Roichman [14, 15] was able to prove a weak version of Theorem 1, where it is shown that $d(t)$ is small if $t > C t_{\mathrm{mix}}$ for some large enough $C > 0$. In his result, $|\Gamma|$ is allowed to grow to infinity as fast as $(1 - \delta)n$ for any $\delta > 0$. To our knowledge, it is in [15] that Theorem 1 first formally appears as a conjecture, although we have no doubt that it had been privately made before. (The lower bound for random transpositions, which is based on counting the number of fixed points in $\pi_t$, works equally well in this context and provides the conjectured correct answer in all cases.) Lulov [13] dedicated his Ph.D. thesis to the problem, and Lulov and Pak [12] obtained a partial proof of the conjecture of Roichman, in the case where $|\Gamma|$ is very large, that is, greater than $n/2$. More recently, Roussel [16] and [17] made some progress in the small $|\Gamma|$ case, working out the character ratios estimates to treat the case where $|\Gamma| < 6$. Saloff-Coste, in his survey article ([18], Section 9.3), discusses the sort of difficulties that arise in these computations and states the conjecture again. A summary of the results discussed above is also given. See also [19], page 381, where work in progress of Schlage-Puchta that overlaps the result in Theorem 1 is mentioned.

1.3. Structure of the proof. To prove Theorem 1, it suffices to look at the cycle structure of $\pi_t$ and check that if $N_t(i)$ is the number of cycles of $\pi_t$ of size $i$ for every $i \ge 1$, and if $t \ge t_{\mathrm{mix}} + C_{k,\varepsilon}\,n$ then the total variation distance between $(N_t(i))_{1 \le i \le n}$ and $(N(i))_{1 \le i \le n}$ is close to 0, where $(N(i))_{1 \le i \le n}$ is the cycle distribution of a random permutation sampled from $\mu$. We thus study the dynamics of the cycle distribution of $\pi_t$, which we view as a certain coagulation-fragmentation chain. Using ideas from Schramm [20], it can be shown that large cycles are at equilibrium much before $t_{\mathrm{mix}}$, that is, at a time of order $O(n)$. Very informally speaking, the idea of the proof is the following. We focus for a moment on the case $k = 2$ of random transpositions, which is the easiest to explain. The process $(\pi_t, t \ge 0)$ may be compared to an Erdős-Rényi random graph process $(G_t, t \ge 0)$ where random edges are added to the graph at rate 1, in such a way that the cycles of the permutation are subsets of the connected components of $G_t$. Schramm's result from [20] then says that, if $t = cn$ with $c > 1/2$ (so that $G_t$ has a giant component), then the macroscopic cycles within the giant component have relaxed to equilibrium. By an old result of Erdős and Rényi, it takes time $t = t_{\mathrm{mix}} + C_{k,\varepsilon}\,n$ for $G_t$ to be connected with probability greater than $1 - \varepsilon$. By this point the giant component encompasses every vertex and thus, extrapolating Schramm's result to this time, the macroscopic cycles of $\pi_t$ have the correct distribution at this point. A separate and somewhat more technical argument is needed to deal with small cycles.
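
The comparison between random transpositions and the random graph process can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: it runs the walk in discrete steps rather than with a Poisson clock, the graph is stored in a hypothetical `UnionFind` helper written for this example, and the parameters are arbitrary. At each step the same pair $\{a, b\}$ is used both as a transposition and as a new edge, and the code checks that every cycle of the permutation stays inside one connected component.

```python
import random

class UnionFind:
    """Connected components of the graph G_t (helper written for this illustration)."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b

def run_coupled(n, steps, rng):
    """Apply `steps` uniform random transpositions and add the same pairs as edges."""
    perm = list(range(n))
    graph = UnionFind(n)
    for _ in range(steps):
        a, b = rng.sample(range(n), 2)
        perm[a], perm[b] = perm[b], perm[a]  # perm becomes perm o (a b)
        graph.union(a, b)                    # edge {a, b} is added to the graph
    return perm, graph

def cycles_inside_components(perm, graph):
    """True if x and perm[x] always share a component, i.e. each cycle lies in one component."""
    return all(graph.find(x) == graph.find(perm[x]) for x in range(len(perm)))

def count_cycles(perm):
    seen = [False] * len(perm)
    count = 0
    for start in range(len(perm)):
        if not seen[start]:
            count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = perm[x]
    return count

def count_components(graph, n):
    return len({graph.find(x) for x in range(n)})

if __name__ == "__main__":
    rng = random.Random(20)
    n = 500
    for steps in (100, 250, 400, 2000):
        perm, graph = run_coupled(n, steps, rng)
        print(steps, cycles_inside_components(perm, graph),
              count_components(graph, n), count_cycles(perm))
```

Since every component is a union of cycles, the number of components never exceeds the number of cycles; the gap between the two counts is what the fragmentation events produce.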

More formally, the proof of Theorem 1 thus proceeds in two main steps. In the first step, presented in Section 2 and culminating in Proposition 18, we show that after time $t_{\mathrm{mix}} + c_{\varepsilon,k}\,n$, the distribution of small cycles is close (in variation distance) to the invariant measure, where a small cycle means that it is smaller than a suitably chosen threshold approximately equal to $n^{7/8}$. This is achieved by combining a queueing-system argument (whereby initial discrepancies are cleared by time slightly larger than $t_{\mathrm{mix}}$ and equilibrium is achieved) with a priori rough estimates on the decay of mass in small cycles (Section 2.1). In the second step, contained in Section 3, a variant of Schramm's coupling from [20] is presented, which allows us to couple the chain after time $t_{\mathrm{mix}} + c_{\varepsilon,k}\,n$ to a chain started from equilibrium, within time of order $n^{5/8}\log n$, if all small cycles agree initially.

2. Small cycles. In this section we prove the following proposition. Let $(N_i(t))_{1 \le i \le n}$ be the number of cycles of size $i$ of the permutation $\pi_t$, where $(\pi_t, t \ge 0)$ evolves according to random $k$-cycles (where $k \ge 2$), but does not necessarily start at the identity permutation. Let $(Z_i)_{i \ge 1}$ denote independent Poisson random variables with mean $1/i$.

Fix $0 < \chi < 1$ and let $K = K(n)$ be the closest dyadic integer to $n^{\chi}$. We think of cycles smaller than $K$ as being small, and big otherwise. Let $I_j = \{i \in \mathbb{Z} : i \in [2^j, 2^{j+1})\}$, $L_j = |I_j| = 2^j$ and

(6) $M_j(t) = \sum_{i \in I_j} N_i(t).$

Introduce the stopping time

(7) $\tau = \inf\{t \ge 0 : \exists\, 0 \le j \le \log_2 K + 1,\ M_j(t) > (\log n)^6/2\}.$

Therefore, prior to $\tau$, the total number of small cycles in each dyadic strip $[2^j, 2^{j+1})$ ($j \le 1 + \log_2 K$) never exceeds $(\log n)^6/2$.

Proposition 3. Suppose that

(8) $P(\tau < n\log n) \to 0$

as $n \to \infty$, and that initially,

(9) $M_j(0) \le D\log(j + 2)$

for all $0 \le j \le \log_2\log n$, for some $D > 0$ independent of $j$ or $n$. Then for any sequence $t = t(n)$ such that $t(n)/n \to \infty$ as $n \to \infty$ and $t(n) \le n\log n$,

$\|(N_i(t))_{i=1}^K - (Z_i)_{i=1}^K\| \to 0.$

In particular, under the assumptions of Proposition 3, for any $\varepsilon > 0$ there is a $c_{\varepsilon,k} > 0$ such that for all $n$ large,

$\|(N_i(c_{\varepsilon,k}\,n))_{i=1}^K - (Z_i)_{i=1}^K\| < \varepsilon.$

In Sections 2.1 and 2.4, Proposition 3 is applied to the chain after time roughly $t_{\mathrm{mix}} = (n\log n)/k$, at which point the initial conditions $M_j(0)$ satisfy (9) (with high probability).
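
The target law $(Z_i)$ can be made tangible with a small Monte Carlo experiment. The Python sketch below is an illustration only: it samples uniform permutations of $S_n$ (so it concerns the case where $\mu$ is uniform on $S_n$) and compares the empirical mean number of cycles of size $i$ with $1/i$, the mean of $Z_i$. The values $n = 50$, the number of trials and the function names are assumptions chosen for the example.

```python
import random

def cycle_counts(perm, max_size):
    """counts[i] = number of cycles of perm of size i, for 1 <= i <= max_size."""
    counts = [0] * (max_size + 1)
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        if length <= max_size:
            counts[length] += 1
    return counts

def mean_small_cycle_counts(n, max_size, trials, rng):
    """Empirical mean of the number of i-cycles of a uniform permutation of n points."""
    totals = [0] * (max_size + 1)
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)  # uniform random permutation
        for i, count in enumerate(cycle_counts(perm, max_size)):
            totals[i] += count
    return [total / trials for total in totals]

if __name__ == "__main__":
    means = mean_small_cycle_counts(n=50, max_size=5, trials=4000, rng=random.Random(3))
    for i in range(1, 6):
        print(f"size {i}: empirical mean {means[i]:.3f}, Poisson mean 1/i = {1 / i:.3f}")
```

The experiment checks means only; the proposition itself is a statement about total variation distance for the whole vector of small-cycle counts.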
Proof of Proposition 3. The proof of this proposition relies on the analysis of the dynamics of the small cycles, where each step of the dynamics corresponds to an application of a $k$-cycle, by viewing it as a coagulation-fragmentation process. To start with, note that every $k$-cycle may be decomposed as a product of $k - 1$ transpositions

$c = (x_k, \ldots, x_1) = (x_k, x_{k-1}) \cdots (x_2, x_1).$

Thus the application of a $k$-cycle may be decomposed into the application of $k - 1$ transpositions: namely, applying $c$ is the same as first applying the transposition $(x_1, x_2)$ followed by $(x_2, x_3)$ and so on until $(x_{k-1}, x_k)$. Whenever one of those transpositions is applied, say $(a, b)$, this can yield either a fragmentation or a coagulation, depending on whether $a$ and $b$ are in the same cycle or not at this time. If they are, say if $b = \sigma^i(a)$ (where $i \ge 1$ and $\sigma$ denotes the permutation at this time), then the cycle $C$ containing $a$ and $b$ splits into $(a, \ldots, \sigma^{i-1}(a))$ and everything else, that is, $(b, \ldots, \sigma^{|C|-i-1}(b))$. If they are in different cycles $C$ and $C'$ then the two cycles merge.
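
The two facts used above can be checked on a small example. The Python sketch below is illustrative only; the helper names and the 0-based labels are assumptions made for the example. It verifies the decomposition of a $k$-cycle into $k - 1$ transpositions (rightmost factor applied first) and shows one fragmentation and one coagulation, with the transposition applied after $\sigma$ so that the pieces match the description $(a, \ldots, \sigma^{i-1}(a))$ and $(b, \ldots, \sigma^{|C|-i-1}(b))$.

```python
def compose(p, q):
    """Return p o q, i.e. the map x -> p[q[x]] (q is applied first)."""
    return [p[q[x]] for x in range(len(q))]

def cycle_perm(n, xs):
    """Permutation of {0, ..., n-1} sending xs[0] -> xs[1] -> ... -> xs[-1] -> xs[0]."""
    p = list(range(n))
    for i, x in enumerate(xs):
        p[x] = xs[(i + 1) % len(xs)]
    return p

def transposition_product(n, xs):
    """Build (x_k x_{k-1}) ... (x_2 x_1) for xs = [x_1, ..., x_k], rightmost factor first."""
    result = list(range(n))
    for j in range(len(xs) - 1):
        factor = cycle_perm(n, [xs[j + 1], xs[j]])
        result = compose(factor, result)  # the new factor acts after the earlier ones
    return result

def apply_transposition(sigma, a, b):
    """Return (a b) o sigma: sigma acts first, then a and b are exchanged."""
    return compose(cycle_perm(len(sigma), [a, b]), sigma)

def cycle_of(sigma, x):
    """List the cycle of sigma through x, starting at x."""
    out, y = [x], sigma[x]
    while y != x:
        out.append(y)
        y = sigma[y]
    return out

if __name__ == "__main__":
    # Decomposition: (x_k, ..., x_1) equals the product of k - 1 transpositions.
    xs = [2, 5, 0, 7]  # plays the role of x_1, ..., x_4
    print(transposition_product(8, xs) == cycle_perm(8, xs[::-1]))  # True

    # Fragmentation: a = 0 and b = sigma^2(a) = 2 lie in the same 5-cycle.
    sigma = cycle_perm(6, [0, 1, 2, 3, 4])
    split = apply_transposition(sigma, 0, 2)
    print(cycle_of(split, 0), cycle_of(split, 2))  # [0, 1] [2, 3, 4]

    # Coagulation: 0 and 2 are now in different cycles, which merge again.
    merged = apply_transposition(split, 0, 2)
    print(cycle_of(merged, 0))  # [0, 1, 2, 3, 4]
```

In this example $|C| = 5$ and $i = 2$, so the pieces have sizes $i = 2$ and $|C| - i = 3$.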

To track the evolution of cycles, we color the cycles with different colors 
(blue, red or black) according (roughly) to the following rules. The blue 
cycles will be the large ones, and the small ones consist of red and black. 
Essentially, red cycles are those which undergo a "normal" evolution, while 
the black ones are those which have experienced some kind of error. By 
"normal evolution," we mean the following: in a given step, one small cycle 
is generated by fragmentation of a blue cycle. It is the first small cycle that is 
involved in this step. In a later step of the random walk, this cycle coagulates 
with a large cycle and thus becomes large again. If at any point of this story, 
something unexpected happens (e.g., this cycle gets fragmented instead of 
coagulating with a large cycle, or coagulates with another small cycle) we 
will color it black. In addition, we introduce ghost cycles to compensate for 
this sort of error. 

We now describe this procedure more precisely. We start by coloring every cycle of the permutation $\sigma(t)$ which is larger than $K$ blue. We denote by $\theta(t)$ the fraction of mass contained in blue cycles, that is,

(10) $\theta(t) = \frac{1}{n} \sum_{i=K+1}^{n} i\,N_i(t).$

Note that by definition of $\tau$,

(11) $1 - \frac{K}{n}(\log n)^6 \le \theta(t) \le 1$

for all $t < \tau$.

We now color the cycles which are smaller than $K$ either red or black according to the following dynamics. Suppose we are applying a certain $k$-cycle $c = (x_k, \ldots, x_1)$, which we write as a product of $k - 1$ transpositions

(12) $c = (x_k, \ldots, x_1) = (x_k, x_{k-1}) \cdots (x_2, x_1)$

(note that we require that $x_i \ne x_j$ for $i \ne j$).

Red cycles. Assume that a blue cycle is fragmented and one of the pieces is small, and that this transposition is the first one in the application of the $k$-cycle $(x_1, \ldots, x_k)$ to involve a small cycle. In that case (and only in
that case), we color it red. Red cycles may depart through coagulation or 
fragmentation. A coagulation with a blue cycle, if it is the first in the step 
and no small cycles were created in this step prior to it, will be called lawful. 
Any other departure will be called unlawful. If a blue cycle breaks up in a way 
that would create a red cycle and both cycles created are small (which may 
happen if the size of the cycle is between K and 2K), then we color the 
smaller one red and the larger one black, with a random rule in the case of 
ties. 

Black cycles. Black cycles are created in one of two ways. First, any red 
cycle that departs in an unlawful fashion and stays small becomes black. 
Further, if the transposition (a, b) is not the first transposition in this step 
to create a small cycle from a blue cycle, or if it is but a previous transpo- 
sition in the step involved a small cycle, then the small cycle(s) created is 
colored black. Now, assume that (a, b) involves only cycles which are smaller 
than K: this may be a fragmentation producing two new cycles, or a merg- 
ing of two cycles producing one new cycle. In this case, we color the new 
cycle(s) black, no matter what the initial color of the cycles, except if this 
operation is a coagulation and the size of this new cycle exceeds K, in which 
case it is colored blue again. Thus, black cycles are created through either 
coagulations of small parts or fragmentation of either small or large parts, 
but black cycles disappear only through coagulation. 

We aim to analyze the dynamics of the red and black system, and the 
idea is that the dynamics of this system are essentially dominated by that 
of the red cycles, where the occurrence of black cycles is an error that we 
aim to control. 

Ghosts. Let $R_i(t)$, $B_i(t)$ be the number of red and black cycles, respectively, of size $i$ at time $t$. It will be helpful to introduce another type of cycle, called ghost cycles, which are nonexisting cycles which we add for counting purposes: the point is that we do not want to touch more than one red cycle in any given step. Thus, for any red cycle departing in an unlawful way, we compensate it by creating a ghost cycle of the same size. For instance, suppose two red cycles $C_1$ and $C_2$ coagulate (this could form a blue or a black cycle). Then we leave in the place of $C_1$ and $C_2$ two ghost cycles $C'_1$ and $C'_2$ of sizes identical to $C_1$ and $C_2$.
Table 1

Coloring algorithm for small cycles, and creation of ghost cycles

• (I) If the transposition is a fragmentation, go to (F); otherwise, go to (C).

• (F) If the fragmentation is of a small cycle $c$ of length $\ell$, go to (FS); otherwise, go to (FL).

• (FS) Color the resulting small cycles black. Create a ghost cycle of length $\ell$, except if $c$ was created in the previous transposition of the current step and is red. Finish.

• (FL) If the fragmentation creates one or two small cycles, and this transposition is the first in the step to either create or involve a small cycle, color the smallest small cycle created red. All other small cycles created are colored black. Do not create ghost cycles. Finish.

• (C) If the coagulation involves a blue cycle, go to (CL); otherwise, go to (CS).

• (CL) If the blue cycle coagulates with a red cycle, and this is not the first transposition in the step that involves a small cycle, then create a ghost cycle; otherwise, do not create a ghost cycle. Finish.

• (CS) If a small cycle remains after the coagulation, it is colored black. If the coagulation involved two red cycles of size $\ell$ and $\ell'$, create two ghost cycles of sizes $\ell$ and $\ell'$, unless one of these two red cycles (say of size $\ell'$) was created in the current step, in which case create only one ghost cycle of size $\ell$. Finish.

In addition to this description, all ghost cycles are killed instantaneously at rate $i\mu(t)$ (for a ghost cycle of size $i$), with $\mu(t)$ defined in (17).

An exception to this rule is that if, during a step, a transposition creates 
a small red cycle by fragmentation of a blue cycle, and later within the 
same step this red cycle either is immediately fragmented again in the next 
transposition or coagulates with another red or black cycle and remains 
small, then it becomes black as above but we do not leave a ghost in its 
place. 

Finally, we also declare that every ghost cycle of size $i$ is killed independently of anything else at an instantaneous rate which is precisely given by $i\mu(t)$, where $\mu(t)$ is a random nonnegative number (depending on the state of the system at time $t$) which will be defined below in (17) and corresponds to the rate of lawful departures of red cycles.

To summarize, we begin at time 0 with all large cycles colored blue and all small cycles colored red. For every step consisting of $k - 1$ transpositions, we run the following algorithm for the coloring of small cycles and creation of ghost cycles (see Table 1).

Let $G_i(t)$ denote the number of ghost cycles of size $i$ at time $t$, and let $Y_i = R_i + G_i$, which counts the number of red and ghost cycles of size $i$. Our goal is twofold. First, we want to show that $(Y_i(t))_{i=1}^K$ is close in total variation distance to $(Z_i)_{i=1}^K$ and second, that at time $t = t(n)$ the probability that there is any black cycle or a ghost cycle converges to 0 as $n \to \infty$.

Remark 4. Note that with our definitions, at each step at most one 
red cycle can be created, and at most one red cycle can disappear without 



MIXING TIMES FOR RANDOM k-CYCLES






being compensated by the creation of a ghost. Furthermore these two events 
cannot occur in the same step. 

Lemma 5. Assume (8) as well as (9), and let $t = t(n)$ be as in Proposition 3. Then

$\|(Y_i(t))_{i=1}^K - (Z_i)_{i=1}^K\| \to 0.$

Proof. The idea is to observe that $Y_i$ has approximately the following dynamics:

rate: $(x \to x + 1) = \lambda$, if $x \ge 0$,

rate: $(x \to x - 1) = i x \mu$, if $x \ge 1$,

and that $\lambda = \mu = k/n + o(1/n)$, so that $(Y_i)$ is approximately a system of M/M/$\infty$ queues where the arrival rate is $k/n$ and the departure rate of every customer is $ik/n$. The equilibrium distribution of $(Y_i)$ is thus approximately Poisson with parameter the ratio of the two rates, that is, $1/i$. The number of initial customers in the queues is, by assumption (9), small enough so that by time $t(n)$ they are all gone, and thus the queue has reached equilibrium.
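
The queueing heuristic can be tested numerically. The following Python sketch is an illustration under assumed parameters ($n = 1000$, $k = 3$, $i = 2$ and the time horizon are arbitrary choices, not values from the paper). It simulates a single M/M/$\infty$ queue with arrival rate $k/n$ and per-customer departure rate $ik/n$ by drawing exponential holding times, and compares the time-averaged queue length with the Poisson mean $1/i$.

```python
import random

def simulate_mm_infinity(arrival_rate, per_customer_rate, horizon, rng, initial=0):
    """Time-average queue length of an M/M/infinity queue on [0, horizon]."""
    t, x, area = 0.0, initial, 0.0
    while True:
        total_rate = arrival_rate + x * per_customer_rate
        dt = rng.expovariate(total_rate)  # exponential holding time in state x
        if t + dt >= horizon:
            area += x * (horizon - t)
            break
        area += x * dt
        t += dt
        # The next event is an arrival with probability arrival_rate / total_rate,
        # otherwise one of the x customers departs.
        if rng.random() < arrival_rate / total_rate:
            x += 1
        else:
            x -= 1
    return area / horizon

if __name__ == "__main__":
    n, k, i = 1000, 3, 2               # assumed values, for illustration only
    lam = mu = k / n                   # the heuristic rates lambda = mu = k/n
    average = simulate_mm_infinity(lam, i * mu, horizon=2.0e6, rng=random.Random(1))
    print(f"time-average queue length {average:.3f}, Poisson mean 1/i = {1 / i:.3f}")
```

Only the ratio of the two rates enters the equilibrium law; the common factor $k/n$ sets the time scale on which the initial customers are forgotten.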

We now make this heuristic precise. To increase $Y_i$ by 1, that is, to create a red cycle, one needs to specify the $j$th transposition, $1 \le j \le k - 1$, of the $k$-cycle at which it is created. The first point $x_1$ of the $k$-cycle must fall somewhere in a blue cycle (which has probability $\theta$). Say that $x_1 \in C_1$, with $C_1$ a blue cycle. In order to create a cycle of size exactly $i$ at this transposition, the second point $x_2$ must fall at either of exactly two places within $C_1$: either $\sigma^{i}(x_1)$ or $\sigma^{-i}(x_1)$. However, note that if $x_2 = \sigma^{-i}(x_1)$ and $|c| = k \ge 3$, then the next transposition is guaranteed to involve the newly formed cycle, either to reabsorb it in the blue cycles, or to turn into a black cycle through coalescence with another small cycle or fragmentation. Either way, this newly formed cycle does not eventually lead to an increase in $Y_i$ since by our conventions, we do not leave a ghost in its place. On the other hand, if $x_2 = \sigma^{i}(x_1)$ then the newly formed red cycle will stay on as a red or a ghost cycle in the next transpositions of the application of the cycle $c$. Whether it stays as a ghost or a red cycle does not change the value of $Y_i$, and therefore, this event leads to a net increase of $Y_i$ by 1. This is true for all of the first $k - 2$ transpositions of the $k$-cycle $c$, but not for the last one, where both $x_k = \sigma^{i}(x_{k-1})$ and $x_k = \sigma^{-i}(x_{k-1})$ will create a red cycle of size $i$. It follows from this analysis that the total rate $\lambda(t)$ at which $Y_i$ increases by 1 satisfies

(13) $\lambda(t) \le \lambda^+ := \frac{k-2}{n-k+1} + \frac{2}{n-k+1} = \frac{k}{n-k+1}.$

To get a lower bound, observe that for $t < \tau$, $\theta(t) \ge 1 - K(\log n)^6/n$ at the beginning of the step. When a $k$-cycle is applied and we decompose it into $k - 1$ elementary transpositions, the value $\theta(t)$ for each of the transpositions may take different successive values which we denote by $\theta(t, j)$, $j = 1, \ldots, k - 1$. However, note that at each such transposition, $\theta$ can only change by at most $\pm 2K/n$. Thus it is also the case that for all $1 \le j \le k - 1$, $\theta(t, j) \ge 1 - 2(k - 1)K(\log n)^6/n$. Therefore, the probability that a fragmentation of a blue cycle does not create any small cycle is also bounded below by

$1 - 2(k - 1)K(\log n)^6/n - 2K(\log n)^6/n = 1 - 2kK(\log n)^6/n =: \theta_-(t).$

It thus follows that the total rate $\lambda(t)$ is bounded below by

k-2\ k( K{\ognf\ 

(14 At > e^S^ - + > - 1 - %k ^ ^ ' \=:\- . 

\n n J n \ n J 

Of course, by this we mean that the $Y_i(t)$ are nonnegative jump processes whose jumps are of size $\pm 1$, and that if $\mathcal{F}_t$ is the filtration generated by the entire process up to time $t$, then

(15) $\lim_{h \to 0^+} \frac{P(Y_i(t + h) = x + 1 \mid \mathcal{F}_t, Y_i(t) = x)}{h} = \lambda(t), \qquad \lambda^- \le \lambda(t) \le \lambda^+,$

almost surely on the event $\{t < \tau\}$. As for negative jumps, we have that for $x \ge 1$,

(16) $\lim_{h \to 0^+} \frac{P(Y_i(t + h) = x - 1 \mid \mathcal{F}_t, Y_i(t) = x)}{h} = i x \mu(t),$

where $\mu(t)$ depends on the partition and satisfies the estimates

(17) $\mu^- \le \mu(t) \le \mu^+,$

where

(18) $\mu^- := \frac{k}{n}\Bigl(1 - 8k\,\frac{K(\log n)^6}{n}\Bigr)$ and $\mu^+ := \frac{k}{n - k}.$

The reason for this is as follows. To decrease $Y_i$ by 1 by decreasing $R_i$, note that the only way to get rid of a red cycle without creating a ghost is to coagulate it with a blue cycle at the $j$th transposition, $1 \le j \le k - 1$, with no other transpositions creating small cycles. The probability of this event is bounded above by $ik/(n - k)$ and, with $\theta_-$ as above, bounded below by

!^0fc-2 + e^9^S^ + 99^^91-' + ■■■ + 99l-^ > -9^:~\ 

n n—1 n—2 n—k+\ n 

Therefore, if in addition ghosts are each killed independently with rate $i\mu(t)$ as above, then (16) holds. More generally, if $1 \le m \le K$ and $i_1 < \cdots < i_m \le K$ are pairwise distinct integers, then we may consider the vector $(Y_{i_1}(t), \ldots, Y_{i_m}(t))$. If its current state is $x = (x_1, \ldots, x_m)$, then it may make transitions to $x' = (x'_1, \ldots, x'_m)$ where the two vectors $x$ and $x'$ differ by exactly one coordinate (say the $j$th one) and $x_j - x'_j = \pm 1$ (since only one queue $Y_i$ can change at any time step, thanks to our coloring rules). Also, writing $Y(t)$ for the vector $(Y_{i_1}(t), \ldots, Y_{i_m}(t))$, we find

$\lim_{h \to 0^+} \frac{P(Y(t + h) = x' \mid \mathcal{F}_t, Y = x)}{h} = \begin{cases} \lambda(t), & \text{if } x'_j = x_j + 1, \\ i_j x_j \mu(t), & \text{if } x'_j = x_j - 1. \end{cases}$

These observations show that we can compare $\{(Y_i(t \wedge \tau))_{1 \le i \le K}, t \ge 0\}$ to a system of independent Markov queues $\{(Y_i^+(t \wedge \tau))_{1 \le i \le K}, t \ge 0\}$ with respect to a common filtration $\mathcal{F}_t$, with no simultaneous jumps almost surely, and such that the arrival rate of each $Y_i^+$ is $\lambda^+$, and the departure rate of each client in $Y_i^+$ is $i\mu^-$. We may also define a system of queues $(Y_i^-)_{1 \le i \le K}$ by accepting every new client of $Y_i^+$ with probability $\lambda^-/\lambda^+$ and rejecting it otherwise. Subsequently, each accepted client tries to depart at a rate $\mu^+ - \mu^-$, or when it departs in $Y_i^+$, whichever comes first. Then one can construct all three processes $(Y_i^-)_{1 \le i \le K}$, $(Y_i)_{1 \le i \le K}$ and $(Y_i^+)_{1 \le i \le K}$ on a common probability space in such a way that $Y_i^-(t) \le Y_i(t) \le Y_i^+(t)$ for all $t \le \tau$.

Note that if $(Z_i^+)_{1 \le i \le K}$ denote independent Poisson random variables with mean $\lambda^+/(i\mu^-)$, then $(Z_i^+)_{1 \le i \le K}$ forms an invariant distribution for the system $(Y_i^+(t), t \ge 0)_{1 \le i \le K}$. Let $(Z_i^+(t), t \ge 0)_{1 \le i \le K}$ denote the system of Markov queues $Y_i^+$ started from its equilibrium distribution $(Z_i^+)_{1 \le i \le K}$. Then $(Y_i^+(t))_{1 \le i \le K}$ and $(Z_i^+(t))_{1 \le i \le K}$ can be coupled as usual by taking each coordinate to be equal after the first time that they coincide. In particular, once all the initial customers of $Y_i^+$ and of $Z_i^+(t)$ have departed (let us call $\tau'$ this time), then the two processes $(Y_i^+)_{1 \le i \le K}$ and $(Z_i^+)_{1 \le i \le K}$ are identical.

We now check that this happens before $t = t(n)$ with high probability. It is an easy exercise to check this for $Z_i^+(t)$ so we focus on $Y_i^+(t)$. To see this, note that by (9), there are no more than $D\log(j + 2)$ customers in every strip $[2^j, 2^{j+1})$ initially if $j \le \log_2\log n$. Moreover, each customer departs with rate at least $2^{j-1}/n$ when in this strip. Thus the time $\tau_j$ it takes for all initial customers of $Y^+$ in strip $[2^j, 2^{j+1})$ to depart is dominated by $(n/2^{j-1}) \max_{1 \le q \le D\log(j+2)} E_q$, where $(E_q)_{q \ge 1}$ is a collection of i.i.d. standard exponential random variables. Hence

$E(\tau_j) \le \frac{n}{2^{j-2}} \bigl(\log D + \log\log(j + 4)\bigr).$

For larger strips we use the crude and obvious bound $M_j(0) \le n$ if $j \ge \log_2\log n$. Moreover, each customer departs at rate at least $2^{j-1}/n$ with $j \ge \lfloor \log_2\log n \rfloor$. Thus, in distribution,

$\tau_j \preceq \frac{n}{2^{j-2}} \max_{1 \le q \le n} E_q$

so that $E(\tau_j) \le n\log n/2^{j-3}$ [we are using here that $E(\max_{1 \le q \le m} E_q) \le 2\log m$ for all $m$ large enough]. Since we obviously have $\tau' \le \sum_{j=0}^{\log_2 K} \tau_j$, we conclude

$E(\tau') \le \sum_{j=0}^{\log_2\log n} \frac{n}{2^{j-2}} \bigl(\log D + \log\log(j + 4)\bigr) + \sum_{j \ge \log_2\log n} \frac{n\log n}{2^{j-3}} \le a(D)\,n,$

where $a(D) < \infty$ depends solely on $D$. By Markov's inequality and since $t(n)/n \to \infty$, we conclude that $\tau' < t$ with high probability. We now claim that $(Y_i^+(t))_{1 \le i \le K} = (Y_i^-(t))_{1 \le i \le K}$ with high probability. To see this, we note that at equilibrium $E(Z_i^+) = \lambda^+/(i\mu^-) \le 2/i$. Therefore,

P(y.+ (t) / Yr{t) for some l<i<K) 

<^{ll y + it) - Y- (t) ; r' < t j + P(r' > t) 



K 



1=1 



<i6(t-i)^fl5i!^ + P(r'>t), 

n 

Since we have already checked that $P(\tau' > t) \to 0$ as $n \to \infty$, this shows that on the event $\{\tau' < t < \tau\}$ and $\{Y_i^+(t) = Y_i^-(t) \text{ for all } 1 \le i \le K\}$ (an event of probability asymptotically one), $(Y_i(t))_{1 \le i \le K}$ can be coupled to $(Z_i^+(t))_{1 \le i \le K}$ which has the same law as $(Z_i^+)_{1 \le i \le K}$. Thus

(19) $\|(Y_i(t))_{i=1}^K - (Z_i^+)_{i=1}^K\| \to 0$

as $n \to \infty$. On the other hand, we claim that

$\|(Z_i)_{i=1}^K - (Z_i^+)_{i=1}^K\| \to 0$

also. Indeed, it is easy to see and well known that for $\alpha, \beta > 0$

$\|\mathrm{Po}(\alpha) - \mathrm{Po}(\beta)\| \le 1 - \exp(-|\alpha - \beta|) \le |\alpha - \beta|.$
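
The Poisson comparison above is easy to check numerically. The Python sketch below is an illustration only: it computes the total variation distance between two Poisson laws by summing the probability mass functions up to a cutoff (the cutoff of 200 and the test values of $\alpha, \beta$ are assumptions of the example, adequate for small parameters) and prints it next to the two upper bounds.

```python
import math

def poisson_pmf(lam, j):
    """P(Po(lam) = j) for lam > 0, computed in log space for numerical stability."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def tv_poisson(alpha, beta, cutoff=200):
    """Total variation distance (1/2) * sum_j |p_j - q_j|, truncated at `cutoff` terms."""
    return 0.5 * sum(abs(poisson_pmf(alpha, j) - poisson_pmf(beta, j))
                     for j in range(cutoff))

if __name__ == "__main__":
    for alpha, beta in [(0.5, 0.6), (1.0, 1.5), (2.0, 2.01), (0.3, 3.0)]:
        gap = abs(alpha - beta)
        print(f"alpha={alpha}, beta={beta}: TV={tv_poisson(alpha, beta):.5f}, "
              f"1-exp(-gap)={1 - math.exp(-gap):.5f}, gap={gap:.5f}")
```

The first inequality reflects the standard coupling in which the larger Poisson variable is the smaller one plus an independent Poisson variable with mean $|\alpha - \beta|$; the two differ only when that extra variable is nonzero.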

Since the coordinates of $Z_i$ and $Z_i^+$ are both independent Poisson random variables but with different parameters, we find that

\\(z.)i.-izr)U\<j:§-\ 



^ 1 / 1 \ 



K 

< 

1=1 



^ 4(fc-l)K(lognf 
~ n 

as $n \to \infty$. By the triangle inequality and (19), this completes the proof of Lemma 5. □
Lemma 6. Let $t = t(n)$ be as in Proposition 3. Then, with probability tending to 1 as $n \to \infty$, $B_i(t) = 0$ for all $1 \le i \le K$.

Proof. Let us consider black cycles in scale $j$, that is, those whose size $i$ satisfies $2^j \le i < 2^{j+1}$ with $j \le \log_2 K$. By assumption (8), before time $t$ the total mass of small cycles never exceeds $2K(\log n)^6$ with high probability. Thus the rate at which a black cycle in scale $j$ is generated by fragmentation of a red cycle (or from another black cycle) is at most

^B,i_j 2K{lognf2^+' 
^ n n 

Black cycles can also be generated directly by fragmenting a blue cycle and 
subsequently fragmenting either the small cycle thus created or some other 
blue cycle in the rest of the step. The rate at which a black fragment in 
scale j occurs in this fashion is thus smaller than 

n n 

Finally, one needs to deal with black cycles that arise through the fragmentation of a blue cycle whose size at the time of the fragmentation is between $K$ and $2K$ (thus potentially leaving two small cycles instead of one). Let $j' = \log_2 K$. We know that, while $s \le \tau$, $M_{j'}(s) \le (\log n)^6/2$. In between steps, the number of cycles in scale $j'$ cannot ever increase by more than $2k$. Thus the rate at which black cycles occur in this fashion at scale $j$ is at most

f 0, ifj</-l, 
\ n n 

This combined rate is therefore smaller than = 3Aj ' . Note that it may 
be the case that several black cycles are produced in one step, although this 
number may not exceed 2k. On the other hand, every black cycle departs 
at a rate which is at least 

■' n n 

since $\theta \ge 1/2$ for $t \le \tau$, say. (Note that when two black cycles coalesce, the new black cycle has an even greater departure rate than either piece before the coalescence, so ignoring these events can only increase stochastically the total number of black cycles.) Thus we see that the number of black cycles in this scale is dominated by a Markov chain $(\beta_j(s), s \ge 0)$ where the rate of



jumps from x to X + 2A; is and the rate of jumps from x to x — 1 is and 
/3j(0) = 0. Speeding up time hy n/2^~^ , 13 j becomes a Markov chain (3j whose 
rates are, respectively, X'^ = QkKilogn)^ /n and 1, and where /3j(0) = 0. We
are interested in 

P(/3j(t) > 0) = ¥{p'j{t') > 0) where t' = tl^'^ jn. 

Note that when there is a jump of size 2k (i.e., when 2k individuals are
born) the time it takes for them to all die in this new time-scale is a random
variable $E$ which has the same distribution as $\max_{1 \le j \le 2k} E_j$, where
$(E_j)_{j \ge 1}$ are i.i.d. standard exponential random variables. Decomposing on
possible birth times of individuals, and noting that $P(E > x) \le 2k e^{-x}$ by
a simple union bound, we see that

$$P(\beta'_j(t') > 0) \le \int_0^{t'} \lambda' P(E > t' - s)\,ds \le \lambda' \int_0^{t'} 2k e^{-(t'-s)}\,ds \le 2k\lambda'.$$

There are log2 K possible scales to sum on, so by a union bound the proba- 
bility that there is any black cycle at time t is, for large n, smaller than or 
equal to k'^K{\ogn)^ /n^n^oo 0. □ 
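The union bound on the maximum of 2k standard exponentials, and its integral against a constant birth rate, can be checked numerically. The sketch below is only an illustration: the function names and the `rate` parameter are assumptions introduced here, not notation from the paper.

```python
import math

def tail_max_exponentials(m: int, x: float) -> float:
    """Exact P(max(E_1, ..., E_m) > x) for i.i.d. standard exponentials."""
    return 1.0 - (1.0 - math.exp(-x)) ** m

def union_bound(m: int, x: float) -> float:
    """Union bound m * exp(-x) on the same tail (it may exceed 1 for small x)."""
    return m * math.exp(-x)

def integrated_bound(rate: float, m: int, horizon: float) -> float:
    """Integral over s in [0, horizon] of rate * m * exp(-(horizon - s)) ds."""
    return rate * m * (1.0 - math.exp(-horizon))
```

Whatever the horizon, the integrated bound never exceeds `rate * m`, which is why only the birth rate matters in the final estimate.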

The case of ghost particles is treated as follows. 

Lemma 7. Let t = t(n) be as in Proposition 3. Then, with probability
tending to 1 as $n \to \infty$, $G_i(t) = 0$ for all $1 \le i \le K$.

Proof. Suppose a red cycle is created, and consider what happens to 
it the next time it is touched. With probability at least 9^~'^ this will be to 
coagulate with a blue cycle with no other small cycle being touched in that 
step, in which case this cycle is not transformed into a ghost. However, in 
other cases it might become a ghost. It follows that any given cycle in Yi is 
in fact a ghost with probability at most 

1-^ ,, i^(logn)6 

It follows that (using the notation from Lemma 5) 

K 

¥{Gi{t) > for some i) < ^E(Gi(t); r' <t) + P(r' > t) 

i=l 

, , A 2 (A;-2)K(logn)6 
<¥{t' >t) + ^ ^ ; V & ; 



^-^ I n 

i=l 

7 



< P(t' > t) + 2(/c - 2) ^ ^ ' 



n 

which tends to 0 as $n \to \infty$. This completes the proof of Lemma 7. □
Completion of the proof of Proposition 3: Since $N_i(t) = Y_i - G_i + B_i$, we
get the proposition by combining Lemmas 5, 6 and 7. □

2.1. Verification of (8) and (9). In order for Proposition 3 to be useful, 
we need to show that assumptions (8) and (9) indeed hold with large enough 
probability. This will be accomplished in Propositions 11 and 16 below. 

Recall the variable Mj [see (6)], and let 

^• = 1 max Mj{t) <n2~^ /{\ognf\. 

I- iG[sn log log n,n log n] J 

Recall that K is the dyadic integer closest to $\lfloor n^{\chi} \rfloor$.

We begin with the following lemma. Its proof is a warm-up to the subse- 
quent analysis. 

Lemma 8. Let 

log2 K+l 

Then, 

^ n— >oo 

Proof. It is convenient to reformulate the cycle chain as a chain that, at
independent exponential times (with parameter k), makes a random
transposition, where the $\ell$th transposition is chosen uniformly at random (if $\ell - 1$
is an integer multiple of k), or uniformly among those transpositions that
involve the ending point of the previous transposition and that would result
in a legitimate k-cycle (i.e., no repetitions are allowed) if $\ell - 1$ is not an
integer multiple of k.
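To make the reformulation concrete, the following sketch builds a k-cycle by chaining transpositions, each one involving the ending point of the previous one. It assumes the standard decomposition of a k-cycle into k - 1 chained transpositions; the indexing convention in the text may differ, and all function names are illustrative.

```python
import random

def compose(p, q):
    """Return the permutation x -> p[q[x]] (apply q first, then p)."""
    return [p[q[x]] for x in range(len(p))]

def transposition(n, a, b):
    """Permutation of range(n) that swaps a and b and fixes everything else."""
    t = list(range(n))
    t[a], t[b] = b, a
    return t

def cycle_lengths(p):
    """Cycle lengths of p (fixed points included), in decreasing order."""
    seen, lengths = [False] * len(p), []
    for s in range(len(p)):
        if not seen[s]:
            length, x = 0, s
            while not seen[x]:
                seen[x] = True
                x = p[x]
                length += 1
            lengths.append(length)
    return sorted(lengths, reverse=True)

def random_k_cycle_by_transpositions(n, k, rng):
    """Chain k - 1 transpositions through k distinct points (no repetitions)."""
    points = rng.sample(range(n), k)
    perm = list(range(n))
    for a, b in zip(points, points[1:]):
        # Each new transposition starts at the ending point of the previous one.
        perm = compose(transposition(n, a, b), perm)
    return perm
```

Because the k points are distinct, the product is always a single k-cycle, which is the "legitimate k-cycle" condition in the text.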

We begin with j = 0. Note that Mo(0) < n and that Mo(t) decreases by 1 
with rate at least kMo{t)n~^ and increases, at most by 2, with rate bounded 
above by k{l — MQ{t)/n)n~^ . In particular, by time nlogn, the number of 
increase events is dominated by twice a Poisson variable of parameter klogn. 
Thus, with probability bounded below by 1 — e~^^°^^^ , at most 2(logn)^ 
parts of size 1 have been born. On this event, Mo{t) < 2(logn)^ + Mo{t) 
where Mo(t) is a process with death only at rate fcMo(t)/n. In particular, 
the time of the n — n/2(logn)^th death in Afo(t) is distributed like the 
random variable 

n— n./2(log n)'^ 
Zo := ^ Si, 

where the $\mathcal{E}_i$ are independent exponential random variables of parameter
$k(n - i)/n$. It follows that $E(Z_0) \sim 3n \log\log n/k$ and the Chebyshev bound
gives, with $\zeta > 0$,

P(Zo > 2E(Zo)) < E(e'^^o)e~2CiE^" 

< e-Er=o"'""°'"'' log(l-Cn/fc(n-i))g-6Cnloglogn/fe 

< g-lg-cn/(logn)^ 

for an appropriate constant c, by choosing ^ = /c/2(logn)^. We thus conclude 
that 

c I U — 

We continue on the event . We consider the process Mi (t) = Mi {t + 
6nfoglogn//c). By definition Mi(0) < n/2. The difference in the analysis 
of Mi{t) and Mo(t) lies in the fact that now, Mi{t) may increase due to 
a merging of two parts of size 1 , and the departure rate is now bounded below 
by 2kMi{t)n~^ . Note that by time nlogn, the total number of arrivals due to 
a merging of parts of size 1 has mean bounded by nlogn • A;(l/(log 
A;n/(logn)^. Repeating the analysis concerning Mq, we conclude similarly 
that 

The analysis concerning Mj{t) proceeds with one important difference. 
Let $s_j = 6\sum_{i=0}^{j} 2^{-i}/k$, $T_j = s_j n \log\log n$, and set $\bar M_j(t) = M_j(t + T_{j-1})$. Now,
Mj{t) can increase due to the merging of a part of size [2^~^n,2^n) with 
a part of size smaller than 2-^n. On flto -^i'' ^^^^ ^^^^ bounded above 
by 

(logn)^ (logn)^ (logn)^ 

One can bound brutally the total number of such arrivals, but such a bound 
is not useful. Instead, we use the definition of the events Al\ that allow 
one to control the number of arrivals "from below." Indeed, note that the 
rate of departures Dt is bounded below by k2^Mj(t) — — l/(logn)^)/n 

(because the total mass below 2^ at times t G [Tj, nlogn] is, on 0^=0"^^' 
bounded above by jn/(logn)^ < n/(logn)^). Thus, when Mj{t) > n2~^~^/ 
(logn)^, the rate of departure Dt » ^(k^J^- Analyzing this simple birth- 
death chain, one concludes that 

Since $T_j \le 12 n \log\log n/k \le 6 n \log\log n$, this completes the proof. □

An important corollary is the following control on the total mass of large 
parts. 
Corollary 9. Let m^{t) = X]i>nx Ni{t). Then,

lim P( min my{t)<n[l — - 

n-s-oo yte[6n log log n.,n log n] \ (log n 



The next step is the following. 

Lemma 10. Set $\mathcal{B}_j = \{\max_{t \in [k^{-1} n(\log n - \log\log n - 1),\, n\log n]} M_j(t) \le (\log n)^6/2\}$.
Then,

$$\lim_{n \to \infty} P\Big(\bigcap_{j=0}^{2\log_2(\log n)} \mathcal{B}_j\Big) = 1.$$




The proof of Lemma 10, while conceptually simple, requires the introduc- 
tion of some machinery and thus is deferred to the end of this subsection. 
Equipped with Lemma 10, we can complete the proof of the following propo- 
sition. 

Proposition 11. With notation as above, 

lim P( max ^°max > (logn)^/2 ) = 0. 

n— >oo ytg[A:-ln(logn— loglogn),nlogTi] j=0 ) 

Proof. Let $R = R(n) = 2\log_2(\log n)$. Because of Lemma 10, it is enough
to consider $M_j(t)$ for $j > R$.

We begin by considering M/j_|_i(i). Let Br denote the intersection of 
Plj^Q^j with the complement of the event inside the probability in Corol- 
lary 9. On the event Bji, for t > A;~^n[logn — log log n — 1] := Tr , the rate 
of arrivals due to merging of parts smaller than 2^ is bounded above by 
k{2^{log{n))^ /n)"^ . The rate of arrivals due to parts larger than 2^ is bounded 
above by k{2^/n), and the jump is no more than 2. Thus, the total rate 
of arrival is bounded above by k2^~^^/n. The rate of departure on the 
other hand is, due to Corollary 9, bounded below by kMji^i{t)2^ /n ■ (1 — 
l/(logn)^). Thus, for Mji^i{t) > logn/2, the difference between the depar- 
ture rate and the arrival rate is bounded below by kMji^i{t)2^ /2n. By 
definition, Mr+i{Tr) < n2-^. Define Tr+i = Tr + nlogn2-^. Let Cr+i = 
{max^g['r^^^ „iog„] Mij+i(t) < logn}. Then, reasoning as in the proof of Lem- 
ma 8, we find that 



P(4^i|S^)<e-('°s") 

Let Brj^i = Br n Cij+i. 
One proceeds by induction. Letting Tr+j = Tj^^j^i + nlogn2^^^^^^ ,
Cr+j = {'on.a^te[TR+j,niogn] MR+j{t) < logn} and Br+j = BR+j^iHCR+j, we 
obtain from the same analysis that for j = 1, . . . , log(A') + 1, 

p(4+,+lli3i?,+i)<e~(l°^«)^ 

Thus, P(4+iog(/^)+i) < + (logn)e-(i°g")' ^„^oc 0, while 

^_R+iog(i^)+i < A:-in[logn - log logn - 1 + 2-^lognJ2j>i'^~^]- This com- 
pletes the proof, since 2^ = (logn)^. □ 



2.2. Proof of Lemma 10. While a proof could be given in the spirit of 
the proof of Lemma 8, we prefer to present a conceptually simple proof 
based on comparison with the random k-regular hypergraph. This coupling
is analogous to the usual coupling with an Erdős-Rényi random graph (see,
e.g., [5] and [20]). Toward this end, we need the following definitions.

Definition 12. A k-regular hypergraph is a pair G = (V, H) where V is
a (finite) collection of vertices, and H is a collection of subsets of V of size k.
The random hypergraph $G_k(n,p)$ is defined as the hypergraph consisting of
V = {1, . . . , n}, with each subset h of V with |h| = k taken independently to
belong to $G_k(n,p)$ with probability p.
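As an illustration of Definition 12, here is a minimal sketch that samples the random hypergraph by brute-force enumeration of all k-subsets and computes the sizes of its connected components. It is only practical for small n, and the function names are assumptions made for this example.

```python
import itertools
import random

def sample_hypergraph(n, k, p, rng):
    """Return the hyperedges of one sample of G_k(n, p) on vertices 1..n."""
    return [h for h in itertools.combinations(range(1, n + 1), k)
            if rng.random() < p]

def component_sizes(n, hyperedges):
    """Sizes of the connected components, in decreasing order (union-find)."""
    parent = list(range(n + 1))          # index 0 is unused
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for h in hyperedges:
        root = find(h[0])
        for v in h[1:]:
            parent[find(v)] = root       # merge every vertex of h into one class
    sizes = {}
    for v in range(1, n + 1):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

For example, `component_sizes(6, [(1, 2, 3), (3, 4, 5)])` groups vertices 1 to 5 together and leaves vertex 6 isolated.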



Let $G_t$ denote the random k-hypergraph obtained by taking V = {1, . . . , n}
and taking H to consist of the k-hyperedges corresponding to the k-cycles
$\gamma_1, \ldots, \gamma_{N_t}$ of the random walk $\pi_t$. It is immediate to check that $G_t$ is
distributed like $G_k(n, p_t)$ with

$$p_t = 1 - \exp\left(-\frac{t}{\binom{n}{k}}\right) \sim \frac{k!\,t}{n^k}.$$



Definition 13. A k-hypertree with h hyperedges in a k-regular
hypergraph G is a connected component of G with i = (k - 1)h + 1 vertices.

(Pictorially, a k-hypertree corresponds to a standard tree with hyperedges,
where any two hyperedges have at most one vertex in common.) k-hypertrees
can be easily enumerated, as in the following, which is Lemma 1 of [10].

Lemma 14. The number of k-hypertrees with i (labeled) vertices is

$$\frac{(i-1)!\, i^{h-1}}{h!\,((k-1)!)^h},$$

where h is the number of hyperedges and thus i = (k - 1)h + 1.
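The enumeration can be sanity-checked by brute force for small parameters. The sketch below assumes the standard count $(i-1)!\,i^{h-1}/(h!\,((k-1)!)^h)$ for k-hypertrees on i = (k - 1)h + 1 labeled vertices, which reduces to Cayley's formula $i^{i-2}$ when k = 2. Function names are illustrative.

```python
from itertools import combinations
from math import factorial

def hypertree_count_formula(k, h):
    """Assumed closed form for the number of k-hypertrees with h >= 1 hyperedges."""
    i = (k - 1) * h + 1
    return factorial(i - 1) * i ** (h - 1) // (factorial(h) * factorial(k - 1) ** h)

def is_connected(i, edges):
    """True if the hyperedges connect all of the vertices 0..i-1 (union-find)."""
    parent = list(range(i))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for e in edges:
        root = find(e[0])
        for v in e[1:]:
            parent[find(v)] = root
    return len({find(v) for v in range(i)}) == 1

def hypertree_count_bruteforce(k, h):
    """Count connected sets of h hyperedges of size k on (k - 1)h + 1 vertices."""
    i = (k - 1) * h + 1
    subsets = list(combinations(range(i), k))
    return sum(1 for edges in combinations(subsets, h) if is_connected(i, edges))
```

A connected hypergraph with h hyperedges of size k has at most (k - 1)h + 1 vertices, so connectedness on exactly that many vertices is equivalent to being a hypertree. The brute force is exponential and is meant only for very small k and h.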
The next lemma controls the number of k-hypertrees with a prescribed
number of edges in $G_t$.

Lemma 15. Let 

T^t,h = {# of k-hypertrees with <h hyperedges in Gt 

is not larger than (logn)^'^}. 

Then, 

t>{n/k) [log n— log log n— 1] 



Proof. Let $t_0 = k^{-1} n[\log n - \log\log n - 1]$ and $h_0 = (\log n)^2$. By
monotonicity, it is enough to check that

(22) mtoAo) 1- 

n— >-oo 

Note that, with i = [k — l)h + 1, and adopting as a convention hlogh = 
when h = 0, 

(log 71 J"^ 

^.„C N ^ E(# of fc-hypertrees with h hyperedges in GfJ 

n^^toM)^ (logn)i-i 

(23) < ' h\ iik-mn^-^ )(D-+4"--D 

^ > - (log n) 1-1 ^ VV h\{{k - ^'^^^ P'^' 

(logn)2 

<Gk y (logn)^+'^-i-ie-('=-^)'^(i°s"-i°s^('=-^» ^ 0. 

h=0 

[Indeed recall that if T is a subset of {1, . . . , n} comprising i elements,
then disconnecting T from the rest of {1, . . . , n} requires closing exactly

$$\binom{i}{1}\binom{n-i}{k-1} + \binom{i}{2}\binom{n-i}{k-2} + \cdots + \binom{i}{k-1}\binom{n-i}{1} \ge i\binom{n-i}{k-1}$$

hyperedges, while $\binom{i}{k} - h$ is the number of hyperedges that need to be closed inside T for it to be
a hypertree.] □

We can now provide the following proof: 

Proof of Lemma 10. At time t, Ni{t) consists of cycles that have 
been obtained from the coagulation of cycles that have never fragmented 
during the evolution by time t, denoted N^(t), and of cycles that have been 
obtained from cycles that have fragmented and created a part of size less 
than or equal to i, denoted A^^^- (t). Note that N?[t) is dominated above by
the number of fc-hypertrees with h edges in Gt, where i = [k — l)h + 1. By 
Lemma 15, this is bounded above by (logn)^'^ with high probability for all 
i < (logn)^. On the other hand, the rate of creation by fragmentation of 
cycles of size i is bounded above by Ak/n, and hence by time nlogn, with 
probability approaching 1 no more than (logn)^'^ cycles of size i have been 
created, for all i < (logn)^. We thus conclude that with probability tending 
to 1, we have, with Iq = k~^n[logn — log log n — 1], 

max max N- (t) < (log n) ' . 

i<{logn)'^ tg[to, nlogn] 

This yields the lemma, since for j < 2 log2 (log n) , 

Mj(t) < (logn)^ max Ni{t). „ 

i<(logn)2 

2.3. Proof of (9). We now prove that at time $t_{\mathrm{mix}} = (1/k) n \log n$, the
assumption (9) [with $M_j(0)$ replaced by $M_j(t_{\mathrm{mix}})$] is satisfied, with high
probability.

Proposition 16. For every $\varepsilon > 0$ there exist $D = D(\varepsilon) > 0$ and $n_0 = n_0(\varepsilon)$ such that for $n > n_0$,

$$P(M_j(t_{\mathrm{mix}}) \le D \log(2 + j),\ j = 0, 1, \ldots, \log_2 \log n + 1) \ge 1 - \varepsilon.$$

Proof. Consider first the time $u = \frac{1}{k}(n \log n - n \log\log n)$.

Lemma 17. With probability approaching 1 as oo, we have Mj{u) < 
2^~^^ log n for all < j < log2 n. 

Proof. As in the proof of Lemma 10, split Mj{t) into two compo- 
nents Mj (t) and Mj{t). Note that the rate at which a fragment of size less 
than 2^~^^ is produced is smaller than 2^~^'^k/n, so for any w < (l//c)nlogn, 
Mj (w) < Poisson(2-'"''^ logn). The probability that such a Poisson random 
variable is more than twice its expectation is (by standard large deviation 
bounds) smaller than n~" for some a > 0, so summing over log2logn values 
of j we easily obtain that with high probability, Mj (u) < 2-^+^ log n for all 
<i < log2 logn. 

It remains to show that Mj{u) < logn for all < j < log2logn with high 
probability. To deal with this part, note that if denotes the number of 
hypertrees with h hyperedges in Gu, then Nf{u) < Th where i = l + h{k — 1) 
is the number of vertices. Reasoning as in (23), we compute after simplifi- 
cations [recalling that u = {l/k){n\ogn — nloglogn) and i = \ + h{k — 1)], 
for /i >
(24) 

nh nh 

Thus summing over i = 2 to i = [log n] , we conclude by Markov's inequality 
that Mj{u) = for all 1 < j < log2logn with high probability. For i = 1 or 
h = 0, we get from (24) 

E(ro)<logn. 

Computing the variance is easy: writing Tq = Xl^gy ^{f is isolated}, we get 
var(ro) < E(ro) + cov(l 

{d is isolated}! -^{tu is isolated})- 

But note that 

TTT,/ ■ ■ 1 1 ■ ■ , l^ is isolated)^ 

r[v IS isolated, w is isolated) = , 

1 -P« 

so 

var(To) < E(ro) + E{Tof ( < E{Th) + o(l). 

Thus by Chebyshev's inequality, $P(M_0(u) > 2\log n) \to 0$ as $n \to \infty$. This
proves the lemma. □

With this lemma we now complete the proof of Proposition 16. We com- 
pare {Mj{t),t'>u) to independent queues as follows. By Proposition 11, 
on an event of high probability, during the interval [u,tmix] the rate at 
which some two cycles of size smaller than logn coagulate is smaller than 
0(((logn)^/n)^), so the probability that this happens during this interval of 
time is o(l). Likewise, the rate at which some cluster smaller than logn will 
fragment is at most A:(logn)^^/n^, so the probability that this happens dur- 
ing the interval [n,tinix] is o(l). Now, aside from rejecting any A;-cycle that 
would create such a transition, the only possible transition for Mj are in- 
creases by 1 (through the fragmentation of a component larger than 2 logn) 
and decreases by 1 (through coagulation with cycle larger than logn). The 
respective rates of these transitions is, as in (13), at most 2-' A"*" = 2^k/ {n — k), 
and at least u = 2^[k/n){l — (log 7i)'^/n)) as in (18). This can be compared to 
a queue where both the departure rate and the arrival rate are equal to , 
say Mj{t). The difference between Mj[t) and Mj{t) is that some of the cus- 
tomers having left in Mj{t) might not have left yet in Mj{t). Excluding 
the initial customers, a total of Poisson($2^j \log\log n$) customers arrive in the
queue Mj{t) during the interval [wjtmix], so the probability that any one of 
those customers has not yet left by time tmix in [t) given that it did leave 
in Mjit) is no more than /u — 1 = 0((logn)^/n), where the constants 
implicit in O(-) do not depend on j or n. Thus with probability greater 
than 1 — 0(2-5' log log n(log?i)^/n), there is no difference between Mj{t^a\x) 
and Mj(tmix). Moreover, 

(25) Mj(tmix) ^ Poisson(l) + Rj, 

where $R_j$ is the total number of initial customers that have not
departed yet by time $t_{\mathrm{mix}}$. Using Lemma 17,

(26) {flj>0}c|-^ max Eq<t^A, 

where {Eq,q > 1) is a collection of i.i.d. standard exponential random vari- 
ables. Using the independence of the queues Mj{t), in combination with (25) 
and (26) as well as standard large deviations for Poisson random variables, 
the proposition follows immediately. □ 

2.4. Conclusion: Small cycles. Combining Propositions 3 and 11, and 
using the notation introduced in the beginning of this section, we have 
proved the following. Fix $\varepsilon > 0$. Then there is a $c_{\varepsilon,k} > 0$ such that with
$t = t(n) = k^{-1} n \log n + c_{\varepsilon,k}\, n$, and all large n,

(27) $\|(N_i(t))_{i=1}^K - (Z_i)_{i=1}^K\| < \varepsilon.$

We now deduce the following:



Proposition 18. Fix $\varepsilon > 0$. Then there is a $c_{\varepsilon,k} > 0$ such that with
$t = t(n) = k^{-1} n \log n + c_{\varepsilon,k}\, n$, and all large n,

(28) $\|(N_i(t))_{i=1}^K - (N_i)_{i=1}^K\| < \varepsilon,$

where $(N_i)_{1 \le i \le n}$ is the cycle distribution of a random permutation sampled
according to the invariant distribution $\mu$.



Proof. By (27) and the triangle inequality, all that is needed is to show
that

(29) $\|(Z_i)_{i=1}^K - (N_i)_{i=1}^K\| \to 0.$

Whenever k is even, and thus $\mu$ is uniform on $S_n$, (29) is a classical result
of Diaconis-Pitman and of Barbour, with explicit upper bound of 4K/n
(see [4] or the discussions around [3], Theorem 2, and [2], Theorem 4.18).
In case k is odd, $\mu$ is uniform on $A_n$. A sample $\gamma$ from $\mu$ can be obtained
from a sample $\gamma'$ of the uniform measure on $S_n$ using the following procedure.
If $\gamma'$ is even, take $\gamma = \gamma'$, otherwise let $\gamma = \pi \circ \gamma'$ where $\pi$ is some fixed
transposition [say (12)]. The probability that the collection of small cycles
in $\gamma$ differs from the corresponding one in $\gamma'$ is bounded above by $4K/n \to 0$,
which completes the proof. □

3. Large cycles and Schramm's coupling. Fix $\varepsilon > 0$ and $\chi \in (7/8, 1)$.
Recall that K is the closest dyadic integer to $\lfloor n^{\chi} \rfloor$ and that a cycle is called
small if its size is smaller than K. For n large, let $t = t(n) = k^{-1} n \log n + c_{\varepsilon,k}\, n$.
We know by the previous section (see Proposition 18) that at this
time, for n large, the distribution of the small cycles of the permutation $\pi_t$ is
arbitrarily close (variational distance smaller than $\varepsilon$) to that of a (uniformly
chosen) random permutation $\pi'$. Therefore we can find a coupling of $\pi := \pi_t$
and $\pi'$ in such a way that

(30) P(the small cycles of $\pi$ and $\pi'$ are identical) $\ge 1 - \varepsilon$.

We can now provide the following proof: 

Proof of Theorem 1. We will construct an evolution of $\pi'$, denoted $\pi'_s$,
that follows the random k-cycle dynamic (and hence, $\pi'_s$ has cycle structure
whose law coincides with the law of the cycle structure of a uniformly
chosen permutation, at all times). The idea is that with small cycles being the
hardest to mix, coupling $\pi_{t+s}$ and $\pi'_s$ will now take very little time. To prove
this, we describe a modified version of the Schramm coupling introduced
in [20], which has the additional property that it is difficult to create small
unmatched pieces.

To describe this coupling, we will need some notation from [20]. Let $\Omega_n$
be the set of discrete partitions of unity

$$\Omega_n = \Big\{ (x_1 \ge \cdots \ge x_n) : x_i \in \{0/n, \ldots, n/n\} \text{ for all } 1 \le i \le n, \text{ and } \sum_{i=1}^n x_i = 1 \Big\}.$$

We identify the cycle count of $\pi_t$ with a vector $Y_t \in \Omega_n$. We thus want to
describe a coupling between two processes $Y_t$ and $Z_t$ taking their values
in $\Omega_n$ and started from some arbitrary initial states. The coupling will be
described by a joint Markovian evolution of $(Y_t, Z_t)$.

We now begin by describing the construction of a random transposition.
For $x \in (0,1)$, let $\{x\}_n$ denote the smallest element of {1/n, . . . , n/n} not
smaller than x. Let $\tilde u$, $\tilde v$ be two random points uniformly distributed in (0, 1),
set $u = \{\tilde u\}_n$, $v = \{\tilde v\}_n$ and condition them so that $u \ne v$. Note that u, v are
both uniformly distributed on {1/n, . . . , n/n}. If we focus for one moment
on the marginal evolution of $(Y_t)$, then applying one transposition to $Y_t$ can
be realized by associating to $Y_t \in \Omega_n$ a tiling of the semi-open interval (0, 1]
where each tile is equally semi-open and there is exactly one tile for each
nonzero coordinate of $Y_t$. (The order in which those tiles are put down may
be chosen arbitrarily and does not matter for the moment.) If u and v fall
in different tiles then we merge the two tiles together and get a new element
of $\Omega_n$ by sorting in decreasing order the size of the tiles. If u and v fall in the
same tile then we use the location of v to split that tile into two parts: one
that is to the left of v, and one that is to its right (we keep the same
semi-open convention for every tile). This procedure works because, conditionally
on falling in the same tile C as u, v is equally likely to be on any point
of $C \cap \{1/n, \ldots, n/n\}$ distinct from u, which is the same fragmenting rule as
explained at the beginning of the proof of Proposition 3.
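The marginal step just described (merge two tiles, or split one) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's construction: sizes are measured in units of 1/n, tiles are laid out in the order given, and in the fragmentation case the tile is relabeled so that u sits at its left end, which makes the cut position uniform over the other points of the tile. The function name is hypothetical.

```python
import random

def transposition_step(parts, rng):
    """Apply one random transposition to a list of cycle sizes summing to n."""
    n = sum(parts)
    u = rng.randrange(n)              # first marked point
    v = rng.randrange(n - 1)          # second point, uniform on the other n - 1
    if v >= u:
        v += 1

    def tile_of(x):
        """Index of the tile containing point x and the offset of x inside it."""
        start = 0
        for idx, size in enumerate(parts):
            if x < start + size:
                return idx, x - start
            start += size

    iu, off_u = tile_of(u)
    iv, off_v = tile_of(v)
    rest = [s for idx, s in enumerate(parts) if idx not in (iu, iv)]
    if iu != iv:
        new = rest + [parts[iu] + parts[iv]]      # coagulation of two tiles
    else:
        c = parts[iu]
        left = (off_v - off_u) % c                # uniform on {1, ..., c - 1}
        new = rest + [left, c - left]             # fragmentation of one tile
    return sorted(new, reverse=True)
```

Each call changes the number of parts by exactly one and preserves the total mass, matching the coagulation-fragmentation description above.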

We now explain how to construct one step of the joint evolution. If
$Y, Z \in \Omega_n$ are two unit discrete partitions, then we can differentiate between
the entries that are matched and those that are unmatched; two entries
from Y and Z are matched if they are of identical size. Our goal will be to
create as many matched parts as possible. Let Q be the total mass of the
unmatched parts. When putting down the tilings associated with Y and Z
we will do so in such a way that all matched parts are at the right of the
interval (0, 1] and the unmatched parts occupy the left part of the interval,
as in Figure 1. If u falls into the matched parts, we do not change the
coupling beyond that described in [20]; that is, if v falls in the same component
as u we make the same fragmentation in both copies, while otherwise we
make the corresponding coalescence. The difference occurs if u falls in the
unmatched parts. Let y and z be the respective components of Y and Z
where u falls, and let $\bar Y$, $\bar Z$ be the reordering of Y, Z in which these
components have been put to the left of the interval (0, 1]. Let a = |y| and let b = |z|
be the respective lengths of the pieces selected with u, and assume without
loss of generality that $a \le b$. Further rearrange, if needed, y and z so that
after the rearrangement, u = 1/n. Because $v \ne u$, necessarily v > 1/n (and
is uniformly distributed on the set {2/n, . . . , n/n}). The point v designates
a size-biased sample from the partition $\bar Y$ and we will construct another
point v', which will also be uniformly distributed on {2/n, . . . , n/n}, to
similarly select a size-biased sample from $\bar Z$. However, while in the coupling
of [20] one takes v = v', here we do not take them equal and apply to $\tilde v$
a measure-preserving map $\Phi$, defined as follows. Define the function

(31) $$\Phi(x) = \begin{cases} x, & \text{if } x > b \text{ or if } 1/n < x \le \gamma_n + 1/n, \\ x - a + \gamma_n + 1/n, & \text{if } a < x \le b, \\ x + b - a, & \text{if } \gamma_n + 1/n < x \le a, \end{cases}$$

where $\gamma_n := \{(a - 1/n)/2\}_n$. See Figure 2 for a description of $\Phi$. Note that $\Phi$
is a measure-preserving map and hence $\tilde v' := \Phi(\tilde v)$ is uniformly distributed
on (0,1). Define $v' = \{\tilde v'\}_n$. With u, v and v' selected, the rest of the
algorithm is unchanged, that is, we make the corresponding coagulations and
fragmentations.

Fig. 1. First step of the coupling. A point u is uniformly chosen on (0,1) and picks
a part in Y and Z, which are then rearranged into $\bar Y, \bar Z$.

This coupling has a number of remarkable properties which we summarize 
below. Essentially, the total number of unmatched entries can only decrease, 
and furthermore it is very difficult to create small unmatched entries, as 
the smallest unmatched entry can only become smaller by a factor of at 
most 2. 
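The measure-preserving property of the map can be checked on the grid. The sketch below assumes that the three branches of (31) are x, x - a + γ_n + 1/n and x + b - a on the ranges stated there, and it works in units of 1/n, so that a, b and x are integers; this discretization and the function name are assumptions made for illustration.

```python
def phi_grid(x, a, b):
    """Grid version of the map Phi, with all lengths in units of 1/n.

    a <= b are the sizes of the two selected unmatched parts and x in
    {2, ..., n} is the position of the second point.
    """
    g = a // 2                  # equals ceil((a - 1) / 2), the grid value of gamma_n
    if x > b or x <= g + 1:
        return x                # identity outside the window (g + 1, b]
    if x > a:                   # a < x <= b: shift down onto (g + 1, g + 1 + b - a]
        return x - a + g + 1
    return x + b - a            # g + 1 < x <= a: shift up onto (g + 1 + b - a, b]
```

On the grid the map is a bijection of {2, ..., n}, the discrete counterpart of measure preservation, so the image of a uniform point is again uniform.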

In what follows, we often speak of the "unmatched entries" between two
permutations, meaning that we associate to these permutations elements
of $\Omega_n$ and identify matched parts in $\Omega_n$ with matched cycles in the
permutations. The translation between the two involves a factor n concerning
the size of the parts, and in all places it should be clear from the context
whether we discuss parts in $\Omega_n$ or cycles of permutations.

Fig. 2. A second point $\tilde v$ is chosen uniformly in (0,1) and serves as a second size-biased
pick for $\bar Y$. $\tilde v$ is mapped to $\tilde v' = \Phi(\tilde v)$ which gives a second size-biased pick for $\bar Z$.

Lemma 19. Let U be the size of the smallest unmatched entry in two
partitions $Y, Z \in \Omega_n$, let Y', Z' be the corresponding partitions after one
transposition of the coupling and let U' be the size of the smallest unmatched entry
in Y', Z'. Assume that $2^j \le U < 2^{j+1}$ for some $j \ge 0$. Then it is always the
case that $U' \ge U - \{U/2\}_n$, and moreover,

$$P(U' < U) \le 2^{j+2}/n.$$

Finally, the number of unmatched parts may only decrease.

Remark 20. Since $U' \ge U - \{U/2\}_n$, it holds in particular that $U' \ge 2^{j-1}$.

Proof of Lemma 19. That the number of unmatched entries can only
decrease is similar to the proof of Lemma 3.1 in [20]. (In fact it is simpler
here, since that lemma requires looking at the total number of unmatched
entries of size greater than $\varepsilon$. Since in our discrete setup no entry can be
smaller than $\varepsilon = 1/n$ we do not have to take this precaution.) We continue to
denote by $M_j$ the total number of parts in the range $[2^j, 2^{j+1})/n$. The only
case in which U can decrease is if there is a fragmentation of an unmatched entry,
since matched entries must fragment in exactly the same way. Now, note that
the coupling is such that when an unmatched entry is selected and is
fragmented, then all subsequent pieces are either greater than or equal to $a - \{a/2\}_n$
(where a is the size of the smaller of the two selected unmatched entries), or
are matched. Moreover, for such a fragmentation to occur, one must select
the lowest unmatched entry (this has probability at most $M_j 2^{j+1}/n$, since
there may be several unmatched entries with size U), and then fragment
it, which has probability at most $2^{j+1}/n$, and thus $P(U' < U) \le 4 M_j 4^j/n^2$.
Since $M_j 2^j \le n$, this completes the proof. □



We have described the basic step of a (random) transposition in the
coupling. The step corresponding to a random k-cycle $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_k)$ is
obtained by taking $u_1 = \gamma_1$, generating v, v' as in the coupling above
(corresponding to the choice of $\gamma_2$), rearranging and taking $u_2$ to correspond to
the location of v, v' after the rearrangement, drawing new v, v'
(corresponding to $\gamma_3$) and so on. In doing so, we are disregarding the constraint that no
repetitions are present in $\gamma$. However, as it turns out, we will be interested
in an evolution lasting at most

(32) $\Delta := n^{5/8} \log n$,

and the expected number of times that a violation of this constraint occurs
during this time is bounded by $2\Delta k^2/n$, which converges to 0 as $n \to \infty$.
Hence, we can in what follows disregard this violation of the constraint.

Now, start with two configurations $Y_0, Z_0$ such that $Z_0$ is the element
of $\Omega_n$ associated with a random uniform permutation. Assume also that
initially, the small parts of $Y_0$ and $Z_0$ (i.e., those that are smaller than K,
the closest dyadic integer to $\lfloor n^{\chi} \rfloor$), are exactly identical, and that they have
the same parity. As we will now see, at time $\Delta$, $\pi_{t+\Delta}$ and $\pi'_\Delta$ will be coupled,
with high probability. Note also that, since initially all the parts that are
smaller than K are matched, the initial number of unmatched entries cannot
exceed $n/K \le n^{1/8}$, and this may only decrease with time by Lemma 19.

Lemma 21. In the next A units of time, the random permutation vr^ 
never has more than a fraction n~^/^(logn)^ of the total mass in parts 
smaller than r{Jl^ , with high probability. 

Proof. The proof is the same as that of Proposition 11, only simpler 
because the initial number of small clusters is within the required range. We 
omit further details. [This can also be seen by computing the probability 
that a given uniform permutation vr' has more than a fraction n~^/^(logn)^ 
of the total mass in parts smaller than n'^^^, and summing over Poisson(A) 
steps.] □ 

Lemma 22. In the next $\Delta$ units of time, every unmatched part of the
permutations is greater than or equal to $n^{3/4}/2$ with high probability.

Proof. Recall that the total number of unmatched parts can never 
increase. Suppose the smallest unmatched part at time s is of scale j (i.e., 
of size in [2-', 2-'^^)), and let j = U{s) be this scale. Then, when touching 
this part, the smallest scale it could go to is j — 1, by the properties of the 
coupling (see Lemma 19). This happens with probability at most 2^~^'^/n. On 
the other hand, with the complementary probability, this part experiences 
a coagulation. And with reasonable probability, what it coagulates with is 
larger than itself, so that it will jump to scale j ' + 1 or larger. To compute this 
probability, note that since this is the smallest unmatched part, all smaller
parts are matched and thus have a total mass controlled by Lemma 21. In 
particular, on an event of high probability, this fraction of the total mass is 
at most q := n~^/^{\ogn)^ . It follows that with probability at least 1 — (7, the 
part jumps to scale at least j + 1, and with probability at most rj := 2^^"^ /n, 
to scale j — 1. Now, when this part jumps to scale at least j + 1, this does 
not necessarily mean that the smallest unmatched part is in scale at least 
J ' + 1, since there may be several small unmatched parts in scale j. However, 
there can never be more than 2n^/^ such parts. If an unmatched piece in 
scale j is touched, we declare it a success if it moves to scale j + 1 (which 
has probability at least 1 — given that it is touched) and a failure if it 
goes to scale j — 1 (which has probability at most Vj). If 2n^/* successes 
occur before any failure occurs at scale j, we say that a good success has 
occurred, and then we know that no unmatched cycle can exist at scale 
smaller than j. Call the complement of a good success a potential failure 
(which thus includes the cases of both a real failure and a success which 
is not good). The probability of a potential failure at scale j is at most 
2v}^^rj/{l — q + Vj), which is bounded above by pj = 6n^^^2^ /n. 

Let {si}i>o be the times at which the smallest unmatched part changes 
scale, with sq being the first time the smallest unmatched part is of scale jo 
where 2-^" = n^/^. Let {Ui} denote the scale of the smallest unmatched part 
at time Si, and let ji be such that 2^^ = n^/^/2. Introduce a birth-death 
chain on the integers, denoted Vn, such that vq = jo and 

{1, ifj=Jo, 
0, ifj=Ji, 
Pj, otherwise, 

and 

(34) P(.„+i =J + l\vn=j) = \l- ^(""^^ = - = > 

[U, J=Jl- 

Set Tj = min{77, > : v„ = j}, and an analysis of the birth-death chain defined 
by (33) and (34) gives that 

io-i 

(see, e.g.. Theorem (3.7) in Chapter 5 of [8]). Thus IP-"'(tj^ < tj^) decays as an 
exponential in (logn)^. Therefore, since P(f2fcA = ji) < 2kAF^°{Tj-^ < tj^), it 
follows that P(f 2A:A = Ji) — as n — 7- 00. On the other hand, between times t 
and t + A, the process {Ui}i>i may have made at most 2kA moves with 
overwhelming probability. This implies that Ui > ji with high probability 
throughout [t,t + A]. □ 
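The hitting estimate for the birth-death chain rests on the classical formula for nearest-neighbour chains (the kind of result cited from [8]). The sketch below computes the probability of reaching a lower level before an upper one with exact rational arithmetic. It is a generic illustration: the function name and the dictionary `down` of downward probabilities are assumptions, and it does not encode the specific boundary conventions of (33) and (34).

```python
from fractions import Fraction

def prob_hit_low_first(down, low, high, start):
    """P_start(hit `low` before `high`) for a nearest-neighbour chain.

    down[j] is the probability of stepping j -> j - 1 for low < j < high;
    otherwise the chain steps j -> j + 1.
    """
    # phi(x) = sum_{m=low}^{x-1} prod_{i=low+1}^{m} down[i] / (1 - down[i])
    phi = {low: Fraction(0)}
    term = Fraction(1)
    for m in range(low, high):
        phi[m + 1] = phi[m] + term
        if m + 1 < high:
            d = Fraction(down[m + 1])
            term *= d / (1 - d)
    return (phi[high] - phi[start]) / (phi[high] - phi[low])
```

When the downward probabilities are small, the product inside the sum makes the probability of descending through many levels shrink geometrically in the number of levels, which is the mechanism behind the exponential decay used in the proof.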
End of the proof of Theorem 1. We now are going to prove that, after $\Delta = n^{5/8} \log n$ steps,
there are no more unmatched parts with high probability.
The basic idea is that, on the one hand, the number of unmatched parts may
never increase, and on the other hand, it does decrease frequently enough.
Since each unmatched part is greater than $n^{3/4}/2$ during this time, any given
pair of unmatched parts is merging at rate roughly $n^{-1/2}$. There are initially
no more than $2n^{1/8}$ unmatched parts, so after $n^{5/8} \log n = \Delta$ steps, no more
unmatched part remains with high probability.

To be precise, assume that there are L unmatched parts. Let $\tau_L$ be the
time to decrease the number of unmatched parts from L to L - 2. Observe
that, for parity reasons ($\pi$ and $\pi'$ must have the same parity of number of
parts at all times), L is always even. Note also that L = 2 is impossible,
so L is at least 4. Assume to start with that both copies have at least 2
unmatched parts. Then, at rate greater than $n^{-1/4}/2$ we pick an unmatched
part in the first point $u_1$ for the k-cycle. Since there are at least 2 unmatched
parts in each copy, let R be the interval of (0, 1) corresponding to a second
unmatched part in the copy that contains the larger of the two selected ones.
Then $|R| \ge n^{-1/4}/2$, and moreover when v falls in R, we are guaranteed
that a coagulation is going to occur in both copies. We interpret this event
as a success, and declare every other possibility a failure. Hence if G is
a geometric random variable with success probability $n^{-1/4}/2$, and $(X_i)_{i \ge 1}$
are i.i.d. exponentials with mean $2n^{1/4}$, the total amount of time before
a success occurs is dominated by $\sum_{i=1}^{G} X_i$.

If, however, one copy (say $\pi$) has only one unmatched part, then one first
has to break that component, which takes at most an exponential random
variable with rate $n^{-1/4}/4$. Note that the other copy must have had at least 3
unmatched parts, so after breaking the big one, both copies have now at least
two unmatched parts and we are back to the preceding case. It follows from
this analysis that in any case, $\tau_L$ is dominated by

$$X' + \sum_{i=1}^{G} X_i,$$

with $X'$ an exponential random variable with rate $n^{-1/4}/4$, and so $E(\tau_L) \le 4n^{1/2} + 4n^{1/2} = 8n^{1/2}$. Now let

$$T_L = \tau_L + \tau_{L-2} + \cdots + \tau_4$$

and let $T = T_{2n^{1/8}}$. Then T is the time to get rid of all unmatched parts.
We obtain from the above $E(T) \le 16 n^{5/8}$. By Markov's inequality, it follows
that $T \le n^{5/8} \log n = \Delta$ with high probability. This concludes the proof of
Theorem 1. □
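The arithmetic behind the final step can be traced with a short sketch. It assumes the exponents read as follows: success probability $n^{-1/4}/2$, exponential waiting times of mean $2n^{1/4}$, the bound $E(T) \le 16 n^{5/8}$, and $\Delta = n^{5/8} \log n$. The function names are illustrative.

```python
import math

def expected_pair_removal_time(n):
    """Upper bound 8 * n**(1/2) on the mean time to remove two unmatched parts."""
    success_prob = n ** (-0.25) / 2          # geometric success probability
    mean_wait = 2 * n ** 0.25                # mean of each exponential waiting time
    merge_phase = mean_wait / success_prob   # Wald: E[G] * E[X] = 4 * n**(1/2)
    return merge_phase + 4 * n ** 0.5        # allowance for a preliminary split

def markov_failure_bound(n):
    """Markov bound on P(T > n**(5/8) * log n) given E(T) <= 16 * n**(5/8)."""
    return min(1.0, 16.0 / math.log(n))
```

The Markov bound is 16/log n, so it tends to 0 only logarithmically and is trivial for moderate n. This is consistent with the statement being asymptotic.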